Discovering Novelty in Gene Data: From Sequential Patterns to Visualization

Authors

  • Arnaud Sallaberry
  • Nicolas Pecheur
  • Sandra Bringay
  • Mathieu Roche
  • Maguelonne Teisseire

Abstract

Data mining techniques allow users to discover novelty in huge amounts of data. Frequent pattern methods have proved efficient, but the extracted patterns are often too numerous and thus difficult for end-users to analyse. In this paper, we focus on sequential pattern mining and propose a new visualization system that helps end-users analyse the extracted knowledge and highlight novelty with respect to referenced biological document databases. Our system is based on two visualization techniques: clouds and solar systems. We show that these techniques are very helpful for identifying associations and hierarchical relationships between patterns among related documents. Sequential patterns extracted from gene data using our system were successfully evaluated by two biology laboratories working on Alzheimer's disease and cancer.

Similar resources

Discovering gene-gene relations from sequential sentence patterns in biomedical literature

In this paper, we have developed a gene–gene relation browser (DiGG) that integrates sequential pattern mining and an information-extraction model to extract knowledge on gene–gene interactions from biomedical literature. DiGG combines efficient mining techniques to enable the discovery of frequent gene–gene sequences even for very long sentences. Our approach aims to detect associated gene relation...

Efficient Discovering of Top-K Sequential Patterns in Event-Based Spatio-Temporal Data

We consider the problem of discovering sequential patterns from event-based spatio-temporal data. The dataset is described by a set of event types and their instances. Based on the given dataset, the task is to discover all significant sequential patterns denoting some attraction relation between event types occurring in a pattern. Previously proposed algorithms discover all significant sequential...

On Novelty Evaluation of Potentially Useful Patterns

As is generally accepted, the most important feature that a Knowledge Discovery in Databases (KDD) system must possess is the ability to discover patterns that are “novel” and “potentially useful”. To allow KDD systems to make novelty and potential usefulness judgments, we extend our former work on discovering “potentially useful” patterns by proposing a formal definition of “novelty” ba...

Mining Temporally-Interesting Learning Behavior Patterns

Identifying sequential patterns in learning activity data can be useful for discovering, understanding, and ultimately scaffolding student learning behaviors in computer-based learning environments. Algorithms for mining sequential patterns generally associate some measure of pattern frequency in the data with the relative importance or ranking of the pattern. However, another important aspect ...

Constraint-based sequential pattern mining: a pattern growth algorithm incorporating compactness, length and monetary

Sequential pattern mining is advantageous for several applications; for example, it finds the sequential purchasing behavior of the majority of customers from a large number of customer transactions. However, existing research in the field of discovering sequential patterns is based on the concept of frequency and presumes that the customer purchasing behavior sequences do not fluctuate with ...

Publication date: 2010